Comments for MEDB 5501, Week 7

Two-sample t-test (Independent-samples t-test)

  • Randomized trial
    • Convenience sample
    • Random assignment to treatment, control
    • Measure continuous outcome
  • Cohort design
    • Observe exposed and control subjects
    • No random assignment
    • Measure continuous outcome

Assumptions

  • Groups 1 and 2 are both normally distributed
    • Assessed with histograms, boxplots, Q-Q plots
    • Or rely on Central Limit Theorem
  • Possibly different means, but same variance
    • Assessed with boxplot, descriptive statistics
    • Also Levene’s test (not recommended)
  • Observations are independent
    • Between groups
    • Within groups
    • Assessed qualitatively
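
A minimal sketch of the pooled (equal-variance) two-sample t statistic that these assumptions support. The function name pooled_t and the data values are made up for illustration. Python's standard library has no t-distribution, so the sketch stops at the test statistic and its degrees of freedom rather than a p-value.

```python
import math
import statistics

def pooled_t(group1, group2):
    """Return (t, df, pooled_sd) for the equal-variance two-sample t-test."""
    n1, n2 = len(group1), len(group2)
    mean1, mean2 = statistics.mean(group1), statistics.mean(group2)
    var1, var2 = statistics.variance(group1), statistics.variance(group2)
    df = n1 + n2 - 2
    # Weighted average of the two sample variances (same-variance assumption)
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df, math.sqrt(pooled_var)

# Made-up illustrative values, not taken from any real study
treatment = [5.1, 6.3, 5.8, 7.0, 6.1]
control = [4.2, 5.0, 4.8, 5.5, 4.5]
t, df, pooled_sd = pooled_t(treatment, control)
print(f"t = {t:.2f} on {df} degrees of freedom")
```

The t statistic would then be compared to a t-distribution with df degrees of freedom, using statistical software or a table.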

Normality

  • Assess each group separately, or
  • Combine after subtracting means
  • Less concern with normality for large sample sizes
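
The "combine after subtracting means" option can be sketched as follows. The function names, the plotting position (i - 0.5) / n, and the data values are assumptions chosen for illustration. The output pairs are the coordinates you would plot in a Q-Q plot.

```python
import statistics

def combined_residuals(group1, group2):
    """Subtract each group's own mean, then pool the residuals."""
    mean1 = statistics.mean(group1)
    mean2 = statistics.mean(group2)
    return [x - mean1 for x in group1] + [x - mean2 for x in group2]

def qq_points(residuals):
    """Pair standard normal quantiles with the sorted residuals."""
    n = len(residuals)
    standard_normal = statistics.NormalDist()
    # (i - 0.5) / n is one common choice of plotting position
    quantiles = [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, sorted(residuals)))

# Made-up illustrative values
group1 = [5.1, 6.3, 5.8, 7.0, 6.1]
group2 = [4.2, 5.0, 4.8, 5.5, 4.5]
residuals = combined_residuals(group1, group2)
points = qq_points(residuals)
for theoretical, observed in points:
    print(f"{theoretical:6.2f}  {observed:6.2f}")
```

If the points fall close to a straight line, the normality assumption looks reasonable.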

Equal variances (homoscedasticity)

  • Compare the box part of the box plots
    • Look for large disparities only (2 or 3 fold)
  • Calculate and compare the standard deviations
    • Again, large disparities only
  • Levene’s test (not recommended)
    • Too little power for small sample sizes (misses large disparities)
    • Too much power for large sample sizes (flags trivial disparities)
    • Very sensitive to normality assumption
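
A rough sketch of the two informal checks: the ratio of the box heights (interquartile ranges) and the ratio of the standard deviations. The function names and data values are made up, and the cutoff of 2 is an assumption taken from the "2 or 3 fold" rule of thumb.

```python
import statistics

def fold_difference(a, b):
    """Ratio of the larger value to the smaller one."""
    return max(a, b) / min(a, b)

def iqr(values):
    """Interquartile range: the height of the box in a boxplot."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

# Made-up illustrative values
group1 = [5.1, 6.3, 5.8, 7.0, 6.1]
group2 = [4.2, 5.0, 4.8, 5.5, 4.5]

sd_fold = fold_difference(statistics.stdev(group1), statistics.stdev(group2))
iqr_fold = fold_difference(iqr(group1), iqr(group2))

# Assumed cutoff: only worry about a 2-fold or larger disparity
worrisome = sd_fold >= 2 or iqr_fold >= 2
print(f"SD ratio {sd_fold:.2f}, IQR ratio {iqr_fold:.2f}, worrisome: {worrisome}")
```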

Independence

  • Assessed qualitatively
  • Independence between groups
    • No matching
    • No longitudinal measures
  • Independence within groups
    • No cluster effects
    • No infectious spread

Housing data dictionary, 1 of 5

source: 
  This file was originally found at DASL
  (Data And Story Library), a website that
  is no longer available. 

description:  
  The original source describes the data as
  "a random sample of records of resales of 
  homes from Feb 15 to Apr 30, 1993 from the
  files maintained by the Albuquerque Board 
  of Realtors. This type of data is 
  collected by multiple listing agencies in
  many cities and is used by realtors as an
  information base."

Housing data dictionary, 2 of 5

copyright:  
  Unknown. You should be able to use this data for
  individual educational purposes under the Fair Use
  guidelines of U.S. copyright law.

format: 
  delimiter: space
  varnames: first row of data
  missing-value-code: *
  rows: 117
  columns: 8

Housing data dictionary, 3 of 5

vars:
  Price:
    label: Selling price
    unit: dollars
    
  SquareFeet:
    label: Living space
    unit: square feet
    
  AgeYears:
    label: Age of home
    unit: years

Housing data dictionary, 4 of 5

  NumberFeatures:
    label: 
      Home features (dishwasher, refrigerator,
      microwave, disposer, washer, intercom, 
      skylight(s), compactor, dryer, handicap
      fit, cable TV access)
    scale: count
    range: 0 to 11  
    
  Northeast:
    label: Located in northeast sector of city?
    values:
      Yes: 1
      No: 0

Housing data dictionary, 5 of 5

  CustomBuild:
    label: Custom built?
    values:
      Yes: 1
      No: 0
    
  CornerLot:
    label: Corner location?
    values:
      Yes: 1
      No: 0

  Tax:
    label: Yearly property tax
    unit: dollars
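
A hedged sketch of reading a file laid out as this data dictionary describes: space delimited, variable names in the first row, and * for missing values. The function name read_housing is hypothetical, the column order is assumed to follow the dictionary, and the two data rows are invented so the example runs without the real file.

```python
import io

def read_housing(lines):
    """Parse space-delimited text with a header row; '*' becomes None."""
    rows = [line.split() for line in lines if line.strip()]
    header, body = rows[0], rows[1:]
    return [
        {name: (None if value == "*" else float(value))
         for name, value in zip(header, record)}
        for record in body
    ]

# Invented rows in the documented layout; NOT values from the real file
sample = io.StringIO(
    "Price SquareFeet AgeYears NumberFeatures "
    "Northeast CustomBuild CornerLot Tax\n"
    "100000 1500 10 4 1 0 0 900\n"
    "85000 1200 * 3 0 0 1 *\n"
)
housing = read_housing(sample)
missing_age = sum(1 for row in housing if row["AgeYears"] is None)
print(f"{len(housing)} rows, {missing_age} missing AgeYears")
```

With the real file, an open file handle would replace the io.StringIO object, and the result should have 117 rows and 8 columns.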

Housing analysis

Sample size calculation
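
One common approach for a two-sample comparison of means, sketched under stated assumptions: equal group sizes, a two-sided test, and the normal approximation n = 2 (z_(1-alpha/2) + z_power)^2 sd^2 / delta^2 per group. The function name n_per_group and the inputs are illustrative, and this is not necessarily the calculation used in the lecture.

```python
import math
from statistics import NormalDist

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate per-group sample size to detect a mean difference delta."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)

# Illustrative inputs: detect a difference of half a standard deviation
n = n_per_group(delta=0.5, sd=1.0)
print(f"{n} subjects per group")
```

Because it uses normal rather than t quantiles, this approximation tends to come out slightly smaller than a t-based calculation.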
